Abstract 

This is a programmatic paper, marking out two direc- 
tions in which the study of social media can contribute 
to broader problems of social science: understanding 
cultural evolution and understanding collective cogni- 
tion. Under the first heading, I discuss some difficulties 
with the usual, adaptationist explanations of cultural 
phenomena, alternative explanations involving network 
diffusion effects, and some ways these could be tested 
using social-media data. Under the second I describe 
some of the ways in which social media could be used to 
study how the social organization of an epistemic com- 
munity supports its collective cognitive performance. 

Let me begin by considering two 1 senses in which we 
might speak of human thought as being "social", and 
how they might orient the study of social information 
processing and social media. 

The first sense is a common-place of many schools in 
the social sciences and humanities: our thought relies 
on the cultural transmission of cognitive tools. Every 
individual thinker, no matter how innovative or even 
lonely they may be, depends crucially on a vast array 
of cognitive tools (concepts, procedures, languages, as- 
sumptions, values, ...) which they did not devise them- 
selves, and could not have devised for themselves. In- 
stead they inherited these cognitive tools from interact- 
ing with other people, who for the most part themselves 
did not invent them (Dewey 1927; Vygotsky 1934/1986; 
Popper 1945; Balkin 1998). 2 (Whether this dependence 
on tradition is a logical necessity, or merely a reflection 
of our peculiar bounded rationality and bounded lifespan, 
is a deep question, fortunately not relevant here.) 
While individual thinkers invent and discover, it is 
nonetheless true that innovations are typically refined, 
extended and perfected by groups, and that it is very 
rare indeed for highly developed concepts and ideas to 
emerge from a single, isolated thinker, rather than from 
a process of interaction (Toulmin 1972; Kitcher 1993; 
Collins 1998; Ziman 2000). 

1 Of course, people think a lot about their own and others' 
social interactions, and a big use of social media is sharing 
these thoughts. But in this social media are no different 
from any other form of human, or for that matter primate, 
association. 

2 "[K]nowledge is a function of association and communication; 
it depends upon tradition, upon tools and methods 
socially transmitted, developed and sanctioned. Faculties 
of effectual observation, reflection and desire are habits 
acquired under the influence of the culture and institutions of 
society, not ready-made inherent powers" (Dewey 1927, p. 
158). Cf. (Popper 1945, ch. 23-24). 

The branches of social science for which these facts 
are common-places have largely developed them philosophically 
(Toulmin 1972; Turner 2002), or qualitatively 
(Vygotsky 1934/1986; 1978; Balkin 1998; Mercer 2000), 
or even ethnographically (Luria 1976; Hutchins 1995). 
(But see (Lupia, McCubbins, & Popkin 2000).) 
In part this has been for reasons of cultural and in- 
tellectual politics, as the relevant scholars have tended 
to fall on the "interpretation" rather than "explanation" 
side of the divide in the social sciences (Sperber 1996), 
so that attention to the social nature of thought 
often goes along with more or less pronounced hostility 
to quantitative and computational modeling (e.g. 
Hutchins; Mercer). This supposed opposition is thoroughly 
misguided (Frawley 1997), but it is not likely 
that anyone will be argued out of it any time soon. 

More promisingly, one good reason for developing 
this idea through small-scale qualitative studies 
has been that it was impossible to gather relevant 
data, suitable for quantitative analysis, on any large 
scale. With the rise of social media, however, many 
people are, for their own purposes, generating exactly 
this kind of data for us — traces of their communica- 
tive interactions as they work out their thoughts about 
matters of common concern. They are doing so on a 
wide range of subjects, under a wide range of different 
institutional mechanisms which structure their interac- 
tions in many different ways, creating natural sources of 
variation which the social scientist can try to exploit to 
learn more about the effects of subject matter, of com- 
municative structure, and of other factors on cultural 
dynamics, and perhaps ultimately even on innovation 
and discovery. The next section points out some of 
the outstanding problems and methodological pitfalls 
of this area. 



The other important sense in which human thought 
can be "social" is that it seems to make sense to regard 
at least some human social institutions as, themselves, 
information-processing systems, engaging in computa- 
tions which cannot be localized to representations in 
the mind of any one of their members. On large scales, 
market economies, corporations and other bureaucra- 
cies, scientific disciplines, and democratic polities all 
have something of this collective information-processing 
character. Knowing how they accomplish this would 
be deeply rewarding, and, if that understanding can 
be used to make them work better, of profound eco- 
nomic and political importance. A frontal assault on 
this problem, as represented by one of those grand insti- 
tutions, is unlikely to succeed (though it may be a mag- 
nificent failure). Fortunately, social information pro- 
cessing also occurs in much humbler institutions, such 
as tagging systems and collaborative filtering, where is- 
sues of data collection and even experimental manipu- 
lation are much more manageable, and where we might 
hope to learn more, before tackling the fundamental 
problems of social science. I will lay out some of what 
should be on the agenda of the study of social infor- 
mation processing, in particular points of contact with 
machine learning. 

Cultural Evolution 

"Culture is the precipitate of cognition and communi- 
cation in a human population" (Sperber 1996). That 
is, cultural traits — beliefs, practices, habits, conven- 
tions, expressions, norms — are not just ones which are 
common across a population, but ones which are spread 
across a population because its members communicate 
with one another. (Knowing that it's painful to look at 
the sun directly is not cultural; knowing that the direc- 
tion in which the sun rises is called "east" is cultural.) 
Cultural phenomena are thus emergent, the result of 
the communicative interaction of cognitive agents. If 
we are to understand how cultures work, we need to 
understand something about both parts, the internal 
cognitive mechanisms and the effects of different patterns 
of interaction. Social media offer a window of 
unrivaled clarity and breadth into the communicative 
part of the problem. This is extremely exciting, but 
we should bear in mind some methodological difficulties 
in interpreting the view through it. 

It is a common-place observation that there are 
strong relationships between cultural traits and so- 
cial attributes; that different social groups accept and 
transmit different bits of culture. Most attempts to 
explain this from within the social sciences (emphat- 
ically including historical materialism (Elster 1985; 
Cohen 2000) and its variants) argue that this is due 
to some causal influence of social organization on the 
content of culture. ("Social being determines consciousness" 
(Marx & Engels 1847/1947); or, once the Hegelian gas 
has been released, social life shapes thought.) In these 
views, culture varies with social position because the 
former is adapted to the latter, or reflects it, or expresses it. 

It is natural for us, as beings acutely sensitive to nu- 
ances of cultural meanings, to try to explain cultural 
differences by trying to explain the content of widely- 
shared, cultural representations. It is natural to sup- 
pose that, say, one news story rises to the top of a 
social aggregation system because it is more interesting 
than other stories which did not. Such explanations are 
even valid a lot of the time. It is nonetheless important, 
as a point of methodological hygiene, to develop ways 
of telling when some bit of culture succeeds in propa- 
gating because its content fits its circumstances, if only 
because, being creatures acutely sensitive to nuances of 
cultural meanings, it is far too easy for us to spin such 
stories no matter what the truth might be. Lieberson 
(2000) points out that many widely-accepted explana- 
tions of trends in fashions, children's names, etc., can- 
not possibly be right (because, e.g., the trend pre-dates 
or is more widely spread than the supposed cause), and 
that these are instead better explained by purely in- 
ternal mechanisms of the respective fields. In biology, 
adaptive and non-adaptive evolution are demarcated by 
means of neutral models. These are models of the ge- 
netic changes which would be expected due to reproduc- 
tive mechanisms and chance alone, all genetic variants 
being assumed to be "adaptively neutral", i.e., of equal 
fitness. Only when actual populations depart markedly 
from the predictions of neutral models can adapta- 
tion be (reliably) inferred (Nitecki & Hoffman 1987; 
Harvey & Pagel 1991). Before the student of social 
media, or other cultural media, can start explaining 
phenomena by reference to content, they need to check 
that there actually is something to be explained. 

A highly simplistic model may make this point more 
concrete. Consider a network in which people have two 
binary traits, one of which is stable (we may think 
of this as "class" or "race" or some similar status), 
and the other is changeable (think of fashions, or po- 
litical opinions). Assume that the network is assor- 
tative on the stable, social-type trait, so that people 
are more likely to be linked to others of the same 
type than those of a different type. Such "assorta- 
tivity" or "homophily" is observed in many, perhaps 
most social networks, often on such stable social-status 
type variables (McPherson, Smith-Lovin, & Cook 2001; 
Newman 2003). Now assign the cultural trait to peo- 
ple uniformly and independently of their social trait (or 
anything else). Initially, then, there will be no correla- 
tion between social and cultural traits, and no assorta- 
tivity based on culture. 

We might expect such correlations to appear if the 
process of cultural transmission and retention is biased: 
if, say, certain cultural values only make sense for 
those in certain social positions. In that case, we would 
expect to find a growing "fit" between cultural and social 
variables, as the former adapt to the latter. But by 
this point you will not be surprised to learn that neutral 
transmission processes can also induce such correlations. 
To be specific, let's implement the "voter 
model" (Liggett 1985): at each discrete time step, a 
node is chosen uniformly at random, independently of 
past and future choices. This node chooses a neighbor 
(again uniformly and independently), and copies 
its value of the cultural trait. 

Figure 1: Neutral copying induces correlations between social 
and cultural traits in assortative networks. The graph 
has 100 nodes, randomly divided between two social types 
(equally probable), and a binary-valued cultural trait (initially 
equally probable). Edges between nodes of the same 
type occur with probability $p_1$, those between different types 
have probability $p_2$. At each time step, a random node 
copies the cultural trait of a random neighbor. Horizontal 
axis: time. Vertical axis: $\chi^2$ statistic for the correlation 
between the social and cultural variables. Black line: behavior 
of an assortative network ($p_1 = 0.09$, $p_2 = 0.01$, assortativity 
coefficient (Newman 2003) of realized graph $r = 0.80$). 
Note the eventual decline of $\chi^2$ as the network moves 
towards a homogeneous equilibrium; in the very long run it 
will reach 0. Grey line: behavior of a non-assortative network 
($p_1 = p_2 = 0.05$, $r = 0.045$). 
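
As a minimal sketch of the neutral model just described (not the code behind Fig. 1, which the paper does not give), the following Python simulates voter-model copying on a random two-type graph and tracks the chi-squared statistic between social type and cultural trait. The function names (`make_graph`, `chi_squared`, `voter_model`), the recording interval, and the choice to leave isolated nodes unchanged are assumptions made for illustration.

```python
import random


def make_graph(n, p_same, p_diff, rng):
    """Random graph: each node gets a social type (0 or 1) with equal
    probability; same-type pairs are linked with probability p_same,
    different-type pairs with probability p_diff."""
    social = [rng.randrange(2) for _ in range(n)]
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_same if social[i] == social[j] else p_diff
            if rng.random() < p:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return social, neighbors


def chi_squared(social, culture):
    """Pearson chi-squared statistic for the 2x2 table of social type
    against cultural trait. Cells with zero expected count are skipped,
    so a culturally homogeneous population gives 0."""
    n = len(social)
    counts = [[0, 0], [0, 0]]
    for s, c in zip(social, culture):
        counts[s][c] += 1
    row = [sum(counts[0]), sum(counts[1])]
    col = [counts[0][0] + counts[1][0], counts[0][1] + counts[1][1]]
    stat = 0.0
    for s in range(2):
        for c in range(2):
            expected = row[s] * col[c] / n
            if expected > 0:
                stat += (counts[s][c] - expected) ** 2 / expected
    return stat


def voter_model(social, neighbors, steps, rng, record_every=100):
    """Neutral copying: at each step a uniformly chosen node copies the
    cultural trait of a uniformly chosen neighbor. Nodes without
    neighbors are left unchanged (an assumption of this sketch)."""
    n = len(social)
    culture = [rng.randrange(2) for _ in range(n)]  # independent of social type
    trajectory = [chi_squared(social, culture)]
    for t in range(1, steps + 1):
        i = rng.randrange(n)
        if neighbors[i]:
            culture[i] = culture[rng.choice(neighbors[i])]
        if t % record_every == 0:
            trajectory.append(chi_squared(social, culture))
    return culture, trajectory


if __name__ == "__main__":
    rng = random.Random(42)
    # Edge probabilities taken from the Fig. 1 caption.
    for label, p_same, p_diff in [("assortative", 0.09, 0.01),
                                  ("non-assortative", 0.05, 0.05)]:
        social, neighbors = make_graph(100, p_same, p_diff, rng)
        _, trajectory = voter_model(social, neighbors, 20000, rng)
        print(label, "largest chi-squared seen:", max(trajectory))
```

A single run is only one realization; comparing the two settings meaningfully would need many runs with different seeds.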

Clearly, in a connected network, there are two ab- 
sorbing states, which are culturally homogeneous, and 
eventually the network must settle into one or the other 
of them, but the time it takes to do so will typically 
be quite long (Sood & Redner 2005). In the mean- 
while, if the network is socially assortative, numerical 
experiments (Fig. 1) show that the social and the cultural 
traits tend to become correlated during a long 
"meta-stable" period. 3 If I'd said that the social types 
were "lower class" and "middle class", and the cultural 
traits "likes black velvet paintings" and "likes black and 
white photographs", the temptation to explain the correlation 
by content would be overwhelming (Bourdieu 
1984). Nonetheless, which way the correlation went 
would be a matter of pure chance, or more exactly of 
the reinforcement and amplification of small fluctuations, 
though some such pattern forms with high probability. 
(This contrast between long- and medium-run 
behavior is not uncommon in self-reinforcing network 
processes (Pemantle & Skyrms 2004).) The strength 
of the dynamically-induced correlations depends on the 
assortativity of the social network; if it is not assortative, 
then the correlations between social and cultural 
traits only rarely rise above the levels to be expected 
by chance (Fig. 1). 

3 One could say that the cultural trait of a node is still 
"in the final analysis" determined by its social type, but 
only with the proviso that the over-all structure of the network 
"screens off" the latter, rendering it causally irrelevant 
(Galles & Pearl 1997). 

Social scientists interested in communications have 
appreciated for a long time that network structure is 
very important to how information flows through a social 
group (Katz & Lazarsfeld 1955; Huckfeldt, Johnson, 
& Sprague 2004), but they have not, so far as I know, 
realized that it can create just the kind of correlation that 
seems to cry out for an explanation by content. In fact, 
the real situation is somewhat worse than this, because 
it really isn't a given that people change because of 
interacting with their neighbors. It could well be that 
people have neighbors who are similar to themselves, 
and so they all respond similarly to common exogenous 
causes, without any direct interactions (Steglich, 
Snijders, & Pearson 2004). 4 If one thinks of trying to 
explain why certain users prefer certain kinds of news 
stories, for example, one must account not only for 
assortativity, but also for common exposure to some outside 
news source. None of this, incidentally, requires that 
people actually make decisions randomly, but only that 
the reasons which lead them to their decisions are 
effectively unpredictable from the other variables in the 
system. 

4 This possibility seems to confound the claims of the recent, 
and widely-publicized, study of the spread of obesity 
in a social network (Christakis & Fowler 2007). 

The moral is not that these kinds of effect explain all 
correlations between social and cultural traits, or even 
between different cultural traits. Rather, it is that 
a neutral explanation is logically possible. To support 
an adaptive explanation of a correlation, then, one must 
show some way in which the neutral model is not 
adequate to the data. For example, additional experiments 
(not shown) indicate that, if I take the model 
simulated in Fig. 1 and break the graph into communities 
(following Newman & Girvan (2003)), then social 
type and cultural traits are conditionally independent, 
given community membership, even in strongly-assortative 
networks. This conditional independence 
does not hold when different social types have differing 
biases for or against various cultural traits. Only when 
we have found and verified such discrepancies between 
our data and the predictions of a good neutral model 
can we say that the adaptive explanation has passed a 
severe test and truly has evidence in its support (Mayo 
1996). 

Collective Cognition 

It's been recognized since the 1930s that market 
economies are "collective calculating devices" (Lange 
& Taylor 1938; Hayek 1948). A market-clearing allocation 
of goods and services is simply too big for anyone 
to grasp, let alone find. Instead it is the process 
of exchange itself which adaptively finds and imple- 
ments this allocation. 5 This is an example of what 
we might call collective cognition, by analogy to the 
classical (Mancur Olson 1971) "collective action". Sim- 
ilarly, the problems of designing policies for govern- 
ments are largely beyond the scope of what anyone 
can actually do, but not beyond the scope of demo- 
cratic deliberation, which reduces the problem from 
solving for the optimal policy in one stroke, to criticizing 
and improving policies piecemeal (Braybrooke & 
Lindblom 1963), in light of the information and ideas 
of many participants (Popper 1945; Lindblom 1965; 
Ober 2005). (Historically, democratic decision-making 
has been associated with more social power than other 
forms of government (McNeill 1982), but the causality 
is unclear.) Similar remarks apply to bureaucratic or- 
ganizations, such as corporations, and to scientific dis- 
ciplines. 

It is notable that modern societies are vastly better 
at collective cognition than earlier ones. The degree 
of organization, and its precision, which we take for 
granted would have been astonishing for even the in- 
habitants of the most advanced societies c. 1600, to 
say nothing of c. 100. Historians have explored some 
of the technical and institutional underpinnings of these 
organizational revolutions (McNeill 1982; Beniger 1986; 
Yates 1989), but at a deeper level we have little idea why 
this is so, or why what we do works (when it does work). 
This makes it harder to improve the functioning of our 
institutions for collective cognition. Economic theories 
of mechanism design attempt to do so, but largely ad- 
dress the problem of motivating people to act in certain 
ways, rather than of how to figure out what the right 
action is (Miller 1992). 

These are all very large themes indeed, of course, and 
it might seem grandiose to even mention them in this 
context. I am not suggesting that studying social me- 
dia will give us the key to all organization technologies. 
What it can do, however, is give us a set of case studies 
where, on a much humbler level, people are nonetheless 
engaged in social information processing and collective 
cognition. Just as no one market participant decides on 
or represents the over-all market allocation, and no one 
scholar ever grasps more than a small portion of what 
is known about conic sections or cellular slime molds, 
the movies or bookmarks which get recommended by 
collaborative filtering services are the emergent products 
of the interactions of many participants (Lerman 
2007). What social media offer us, again, is the possibility 
to automatically collect large-scale data on such 
phenomena, combined with a clear understanding of 
the interaction structure (or at least a lot of it), as well 
as much of the external circumstances and the goals 
of the group. We can thus begin, at least at a small 
scale, to build and systematically test theories 
which explain how social information processing 
and collective cognition succeed when they do. 

5 On the formal computational power of market-like systems, 
see (Walsh et al. 2003). 

It might be thought that the theoretical explanation 
is rather simple, and goes (currently) under the name 
of "the wisdom of crowds" (Surowiecki 2004): individuals 
make noisy guesses, which on average are unbiased 
and uncorrelated, so simple averaging leads to conver- 
gence on the appropriate answer. Taken seriously, this 
explanation implies that our economy, our sciences and 
our polities manage to work despite their social organi- 
zation, that science (for example) would progress much 
faster if scientists did not collaborate, did not read each 
other's papers, etc. While every scientist feels this way 
occasionally, it is hard to take seriously. Clearly, there 
has to be an explanation for the success of social in- 
formation processing other than averaging uncorrelated 
guesses, something which can handle, and perhaps even 
exploit, statistical dependence between decision mak- 
ers. 
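
To make the limits of the averaging story concrete, here is a minimal sketch under assumptions that are mine rather than the paper's: each guess is the truth plus Gaussian noise with standard deviation `sigma`, and every pair of guesses has the same correlation `rho`, produced by one shared noise term. Under that model the mean squared error of the crowd average is `sigma**2 * (rho + (1 - rho) / n)`, which goes to zero as `n` grows only when `rho` is 0. The function names are hypothetical.

```python
import random


def crowd_average(truth, n, rho, sigma, rng):
    """Average of n unbiased guesses whose errors have pairwise correlation rho.
    Each error is sigma * (sqrt(rho) * shared + sqrt(1 - rho) * private)."""
    shared = rng.gauss(0.0, 1.0)
    total = 0.0
    for _ in range(n):
        private = rng.gauss(0.0, 1.0)
        total += truth + sigma * (rho ** 0.5 * shared + (1 - rho) ** 0.5 * private)
    return total / n


def simulated_mse(truth, n, rho, sigma, trials, rng):
    """Monte Carlo estimate of the mean squared error of the crowd average."""
    errors = [(crowd_average(truth, n, rho, sigma, rng) - truth) ** 2
              for _ in range(trials)]
    return sum(errors) / len(errors)


def theoretical_mse(n, rho, sigma):
    """Variance of the mean of n equally correlated, unbiased guesses."""
    return sigma ** 2 * (rho + (1 - rho) / n)


if __name__ == "__main__":
    rng = random.Random(7)
    for rho in (0.0, 0.2, 0.5):
        for n in (10, 100, 1000):
            print("rho =", rho, "n =", n,
                  "simulated:", simulated_mse(10.0, n, rho, 1.0, 500, rng),
                  "theory:", theoretical_mse(n, rho, 1.0))
```

The sketch only shows why plain averaging stops helping once guesses are dependent; it says nothing about mechanisms that exploit dependence, which is the question raised in the text.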

A particularly interesting line of attack on these prob- 
lems is suggested by the analogy with ensemble meth- 
ods in machine learning. As Domingos (1999) has 
pointed out, the success of these methods seems to con- 
found naive interpretations of Occam's Razor, in much 
the same way that the success of social information pro- 
cessing confounds the simple "wisdom of the crowds" 
story. Ensemble methods, in which large numbers of 
low-capacity classifiers or predictors (e.g., shallow clas- 
sification trees) are combined, effectively create a sin- 
gle model of what appears to be very high capacity, 
and so they appear to be nothing but an invitation to 
over-fitting. Worse, typical ensemble methods such as 
boosting (Hastie, Tibshirani, & Friedman 2001), bagging 
(Breiman 1996) and mixtures of experts (Jacobs 
1997) create correlated low-level predictors, so that the 
simple average-the-crowd story is inapplicable. In fact, 
it is precisely because the component predictors are cor- 
related, but not identical, that the actual capacity of the 
ensemble is much smaller than its apparent capacity. 

A similar result holds for cooperative problem-solving 
(Hong & Page 2004). Under mild conditions, it can be 
shown that a large group of "weak" heuristic problem- 
solvers, whose performance in isolation is only slightly 
better than random search, will actually out-perform a 
similarly-sized group of "strong" heuristics, ones whose 
average performance in isolation is much better. One of 
those conditions, however, is that the problem-solvers 
must be able to communicate with each other, making 
their candidate solutions strongly dependent rather 
than uncorrelated. There is good evidence that this 
beneficial effect of heuristic diversity and communication 
is actually seen in the cognitive performance of human 
groups (Page 2007). This suggests a very promising 
direction for research on social information processing, 
namely to use the mathematical techniques 
of statistical learning theory to establish bounds on 
the performance of suitable sorts of ensemble-learners 
and group problem-solvers, and see how close actual 
social information processing systems come to attaining 
those bounds, and how those systems could be improved by 
changes to their architectures. 

Both ensemble methods and the Hong & Page re- 
sults on diverse heuristics posit relatively simple forms 
of "social" organization, such as direct averaging, or 
passing a problem to the next person able to improve 
on the current solution. There is every reason to think, 
however, that the optimal form of organization will ac- 
tually depend on the structure of the problem being 
solved. (Cf. Braybrooke & Lindblom (1963) on how the 
social organization of policy analysts serves their cognitive 
strategy of "disjointed incrementalism.") In particular, 
coordination over time is not an issue in ensemble 
methods, and is handled by assumption in the Hong 
& Page model, but is extremely important in real-world 
systems for social information processing and collective 
cognition. 

This suggests a final line of research, one which draws 
together ideas from distributed systems, economics and 
statistical mechanics. Experience with distributed sys- 
tems shows that often the hardest part of their design is 
ensuring coordination over time, and that failure to do 
so can lead to all manner of unwanted behavior, in par- 
ticular to wild oscillations and/or locking into deeply 
undesirable configurations (Lynch 1996). In fact, the 
failure modes of distributed systems are strongly remi- 
niscent of the pathologies of economic (Chamley 2004) 
and statistical-mechanical (Young 1998) models of so- 
cial learning, when they are placed in suitable (that is, 
unsuitable) situations. Designing, or reforming, a system 
for computer-mediated social information processing 
is at once a problem of distributed algorithm design and 
a problem of mechanism design, and the two modes or 
aspects should inform one another, as should empirical 
results about what actually happens when real human 
beings use different systems for different tasks. 